Zero-Shot Recognition via Structured Prediction
Authors
Abstract
We develop a novel method for zero-shot learning (ZSL) based on test-time adaptation of similarity functions learned from training data. Existing methods rely exclusively on source-domain side information to recognize unseen classes at test time. We show that for batch-mode applications, accuracy can be significantly improved by adapting these predictors to the observed test-time target-domain ensemble. We develop a novel structured prediction method for maximum a posteriori (MAP) estimation, whose parameters account for the test-time domain shift relative to what is predicted primarily from source-domain information. We propose a Gaussian parameterization for the MAP problem and derive an efficient structured prediction algorithm. Empirically, we test our method on four popular benchmark image datasets for ZSL and show significant improvement over the state of the art, on average by 11.50% in recognition accuracy and 30.12% in mean average precision (mAP) for retrieval.
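The batch-mode adaptation idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes isotropic unit-variance Gaussian classes, hard assignments, and a Gaussian prior centred on each class mean predicted from source-domain information. The function name `adapt_prototypes` and the parameter `prior_weight` are hypothetical names chosen for illustration.

```python
import numpy as np

def nearest_prototype(features, prototypes):
    """Assign each sample to the class mean with the smallest squared distance."""
    diffs = features[:, None, :] - prototypes[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)

def adapt_prototypes(features, init_prototypes, n_iters=10, prior_weight=1.0):
    """Refine class means on an unlabeled test batch (illustrative only).

    features:        (n, d) test-time features of unseen-class samples.
    init_prototypes: (k, d) class means predicted from source-domain information.
    prior_weight:    assumed pseudo-count pulling each mean toward its
                     initial prediction (hypothetical parameter).
    """
    features = np.asarray(features, dtype=float)
    init_prototypes = np.asarray(init_prototypes, dtype=float)
    prototypes = init_prototypes.copy()
    for _ in range(n_iters):
        labels = nearest_prototype(features, prototypes)
        for c in range(len(prototypes)):
            members = features[labels == c]
            # MAP estimate of a Gaussian mean under a Gaussian prior
            # centred on the source-domain prediction.
            prototypes[c] = (prior_weight * init_prototypes[c]
                             + members.sum(axis=0)) / (prior_weight + len(members))
    return nearest_prototype(features, prototypes), prototypes
```

Under these assumptions, each class mean moves from its source-domain prediction toward the centre of the test samples assigned to it, which is one simple way to absorb a domain shift observed at test time.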
Similar resources
Zero-Shot Activity Recognition with Verb Attribute Induction
In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs. For example, the verb “salute” has several properties, such as being a light movement, a social act, and short in duration. We use these attributes as the internal mapping between visual and textual representations to reason about a previously unseen action....
Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification
This paper addresses the task of zero-shot image classification. The key contribution of the proposed approach is to control the semantic embedding of images – one of the main ingredients of zero-shot learning – by formulating it as a metric learning problem. The optimized empirical criterion associates two types of sub-task constraints: metric discriminating capacity and accurate attribute pre...
Multi-Label Zero-Shot Human Action Recognition via Joint Latent Embedding
Human action recognition refers to automatically recognizing human actions from a video clip, which is one of the most challenging tasks in computer vision. Because annotating video data is laborious and time-consuming, most of the existing works in human action recognition are limited to a number of small-scale benchmark datasets where there are a small number of video clips associated...
Alternative Semantic Representations for Zero-Shot Human Action Recognition
A proper semantic representation for encoding side information is key to the success of zero-shot learning. In this paper, we explore two alternative semantic representations especially for zero-shot human action recognition: textual descriptions of human actions and deep features extracted from still images relevant to human actions. Such side information is accessible on the Web with little cost...
Link the Head to the "Beak": Zero Shot Learning from Noisy Text Description at Part Precision
In this paper, we study learning visual classifiers from unstructured text descriptions at part precision with no training images. We propose a learning framework that is able to connect text terms to their relevant parts and suppress connections to non-visual text terms without any part-text annotations. For instance, this learning process enables terms like “beak” to be sparsely linked to the v...